Matching Multi-lingual Subject Vocabularies
Abstract
Most libraries and other cultural heritage institutions use controlled knowledge organisation systems, such as thesauri, to describe their collections. Unfortunately, because most of these institutions use different systems, unified access to heterogeneous collections is difficult. The problem is even worse in an international context, where concepts have labels in different languages. To overcome this multilingual interoperability problem between European libraries, extensive work has been done to manually map concepts from different knowledge organisation systems, which is a tedious and expensive process. Within the TELplus project, we developed and evaluated methods to discover these mappings automatically, using different ontology matching techniques. In experiments on the major French, English and German subject heading lists (Rameau, LCSH and SWD, respectively), we show that we can automatically produce mappings of surprisingly good quality, even when using relatively naive translation and matching methods.
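As a rough illustration of what "relatively naive translation and matching" could look like, the sketch below translates each source label through a lookup table and then matches it exactly against normalised target labels. This is not the method used in the TELplus experiments; the function names, the concept identifiers, and the tiny translation table are all hypothetical stand-ins chosen for the example.

```python
import unicodedata


def normalize(label: str) -> str:
    """Lowercase, strip accents, and collapse whitespace in a label."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def match_vocabularies(source, target, translations):
    """Return sorted (source_id, target_id) pairs whose labels match.

    source, target: dict mapping a concept id to a list of labels.
    translations: dict mapping a normalised source-language label to a
        list of target-language labels (a toy stand-in for a real
        translation resource; this is an assumption of the sketch).
    """
    # Index target concepts by normalised label for constant-time lookup.
    index = {}
    for target_id, labels in target.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(target_id)

    mappings = set()
    for source_id, labels in source.items():
        for label in labels:
            # Translate the source label, then look for an exact match.
            for candidate in translations.get(normalize(label), []):
                for target_id in index.get(normalize(candidate), ()):
                    mappings.add((source_id, target_id))
    return sorted(mappings)


# Toy data with hypothetical concept identifiers.
french = {"fr:1": ["Bibliothèques"], "fr:2": ["Musique"]}
english = {"en:a": ["Libraries"], "en:b": ["Music"], "en:c": ["Museums"]}
fr_to_en = {"bibliotheques": ["libraries"], "musique": ["music"]}

print(match_vocabularies(french, english, fr_to_en))
```

Exact matching after normalisation is the simplest possible choice; a real system would likely also need to handle compound headings, synonyms, and ambiguous translations.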
Similar resources
Application of Radon Transform in Detecting Turning Angle of Bodies and in Reading Multi-Lingual Documents
Recently, image processing techniques and robotic vision have been widely applied in fault detection of industrial products as well as in document reading. To compare the images captured from the target, it is necessary to prepare a perfect reference image and then apply matching. Preprocessing must therefore be done to correct for movement of the sample and/or the camera, which can occur during the...
N-Gram Language Modeling for Robust Multi-Lingual Document Classification
Statistical n-gram language modeling is used in many domains, such as speech recognition, language identification, machine translation, character recognition and topic classification. Most language modeling approaches work on n-grams of terms. This paper reports on ongoing research in the MEMPHIS project, which employs models based on character-level n-grams instead of term n-grams. The models ar...
Multi-lingual Features of the Unified Medical Language System
The Unified Medical Language System (UMLS) is a terminology integration system developed and maintained by the U.S. National Library of Medicine (NLM). Over the past 20 years, the UMLS Metathesaurus has been extended to encompass 168 source vocabularies. While English is the dominant language (116 source vocabularies), 52 vocabularies in the Metathesaurus are in languages other than English (6 ...
An API for Multi-lingual Ontology Matching
Ontology matching consists of generating a set of correspondences between the entities of two ontologies. This process is seen as a solution to data heterogeneity in ontology-based applications, enabling interoperability between them. However, existing matching systems are designed by assuming that the entities of both source and target ontologies are written in the same languages (English...
Publication date: 2009